AITopics | gpt-3 model

Collaborating Authors

gpt-3 model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Assessing how hyperparameters impact Large Language Models' sarcasm detection performance

Gole, Montgomery, Miranskyy, Andriy

arXiv.org Artificial IntelligenceApr-17-2025

Sarcasm detection is challenging for both humans and machines. This work explores how model characteristics impact sarcasm detection in OpenAI's GPT, and Meta's Llama-2 models, given their strong natural language understanding, and popularity. We evaluate fine-tuned and zero-shot models across various sizes, releases, and hyperparameters. Experiments were conducted on the political and balanced (pol-bal) portion of the popular Self-Annotated Reddit Corpus (SARC2.0) sarcasm dataset. Fine-tuned performance improves monotonically with model size within a model family, while hyperparameter tuning also impacts performance. In the fine-tuning scenario, full precision Llama-2-13b achieves state-of-the-art accuracy and $F_1$-score, both measured at 0.83, comparable to average human performance. In the zero-shot setting, one GPT-4 model achieves competitive performance to prior attempts, yielding an accuracy of 0.70 and an $F_1$-score of 0.75. Furthermore, a model's performance may increase or decline with each release, highlighting the need to reassess performance after each release.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.06166

Country:

North America > Canada (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Media (0.66)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Review for NeurIPS paper: Language Models are Few-Shot Learners

Neural Information Processing SystemsJan-21-2025, 22:32:13 GMT

Strengths: The paper in one of these research works that are simple conceptually (training a very large language model at scale) yet ground-breaking (redefines what we thought was possible). The amount of work behind this is enormous and the combination of simplicity, strong engineering work and new discovery makes it a very enjoyable paper to read. I have of course particularly enjoyed reading the part on the distinction of zero-/one-/few-shot learning and seeing the incredible capacity of the GPT-3 model. The fact that a very big neural net can perform a language task without any finetuning is definitely novel and in my opinion unforeseen. This takes us much closer to a system capable of performing multiple tasks at once with little to no supervision - as humans - and reveals a hint of what will be possible in the *near* future with large-scale self-supervised techniques, possibly combined with multiple modalities.

few-shot learner, language model, neurips paper, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

Add feedback

Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3)

Zhan, Tong, Shi, Chenxi, Shi, Yadong, Li, Huixiang, Lin, Yiyu

arXiv.org Artificial IntelligenceMay-15-2024

With the rapid development of natural language processing (NLP) technology, large-scale pre-trained language models such as GPT-3 have become a popular research object in NLP field. This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural language processing (NLP). By introducing the importance of sentiment analysis and the limitations of traditional methods, GPT-3 and Fine-tuning techniques are introduced in this paper, and their applications in sentiment analysis are explained in detail. The experimental results show that the Fine-tuning technique can optimize GPT-3 model and obtain good performance in sentiment analysis task. This study provides an important reference for future sentiment analysis using large-scale language models.

language model, language processing, sentiment analysis, (14 more...)

arXiv.org Artificial Intelligence

2405.0977

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Narrating Causal Graphs with Large Language Models

Phatak, Atharva, Mago, Vijay K., Agrawal, Ameeta, Inbasekaran, Aravind, Giabbanelli, Philippe J.

arXiv.org Artificial IntelligenceMar-11-2024

The use of generative AI to create text descriptions from graphs has mostly focused on knowledge graphs, which connect concepts using facts. In this work we explore the capability of large pretrained language models to generate text from causal graphs, where salient concepts are represented as nodes and causality is represented via directed, typed edges. The causal reasoning encoded in these graphs can support applications as diverse as healthcare or marketing. Using two publicly available causal graph datasets, we empirically investigate the performance of four GPT-3 models under various settings. Our results indicate that while causal text descriptions improve with training data, compared to fact-based graphs, they are harder to generate under zero-shot settings. Results further suggest that users of generative AI can deploy future applications faster since similar performances are obtained when training a model with only a few examples as compared to fine-tuning via a large curated dataset.

causality, dataset, graph, (15 more...)

arXiv.org Artificial Intelligence

2403.07118

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Ohio > Butler County > Oxford (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Ontario > Thunder Bay (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.57)

Add feedback

Is ChatGPT a game changer for geocoding -- a benchmark for geocoding address parsing techniques

Yin, Zhengcong, Li, Diya, Goldberg, Daniel W.

arXiv.org Artificial IntelligenceDec-15-2023

The remarkable success of GPT models across various tasks, including toponymy recognition motivates us to assess the performance of the GPT-3 model in the geocoding address parsing task. To ensure that the evaluation more accurately mirrors performance in real-world scenarios with diverse user input qualities and resolve the pressing need for a 'gold standard' evaluation dataset for geocoding systems, we introduce a benchmark dataset of low-quality address descriptions synthesized based on human input patterns mining from actual input logs of a geocoding system in production. This dataset has 21 different input errors and variations; contains over 239,000 address records that are uniquely selected from streets across all U.S. 50 states and D.C.; and consists of three subsets to be used as training, validation, and testing sets. Building on this, we train and gauge the performance of the GPT-3 model in extracting address components, contrasting its performance with transformer-based and LSTM-based models. The evaluation results indicate that Bidirectional LSTM-CRF model has achieved the best performance over these transformer-based models and GPT-3 model. Transformer-based models demonstrate very comparable results compared to the Bidirectional LSTM-CRF model. The GPT-3 model, though trailing in performance, showcases potential in the address parsing task with few-shot examples, exhibiting room for improvement with additional fine-tuning. We open source the code and data of this presented benchmark so that researchers can utilize it for future model development or extend it to evaluate similar tasks, such as document geocoding.

address component, dataset, variation, (14 more...)

arXiv.org Artificial Intelligence

2310.1436

Country:

North America > United States > Texas > Brazos County > College Station (0.15)
North America > United States > Texas > Harris County > Houston (0.05)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.40)

Industry:

Health & Medicine > Therapeutic Area (0.48)
Health & Medicine > Public Health (0.46)
Health & Medicine > Epidemiology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On Sarcasm Detection with OpenAI GPT-based Models

Gole, Montgomery, Nwadiugwu, Williams-Paul, Miranskyy, Andriy

arXiv.org Artificial IntelligenceDec-7-2023

Sarcasm is a form of irony that requires readers or listeners to interpret its intended meaning by considering context and social cues. Machine learning classification models have long had difficulty detecting sarcasm due to its social complexity and contradictory nature. This paper explores the applications of the Generative Pretrained Transformer (GPT) models, including GPT-3, InstructGPT, GPT-3.5, and GPT-4, in detecting sarcasm in natural language. It tests fine-tuned and zero-shot models of different sizes and releases. The GPT models were tested on the political and balanced (pol-bal) portion of the popular Self-Annotated Reddit Corpus (SARC 2.0) sarcasm dataset. In the fine-tuning case, the largest fine-tuned GPT-3 model achieves accuracy and $F_1$-score of 0.81, outperforming prior models. In the zero-shot case, one of GPT-4 models yields an accuracy of 0.70 and $F_1$-score of 0.75. Other models score lower. Additionally, a model's performance may improve or deteriorate with each release, highlighting the need to reassess performance after each release.

dataset, gpt-3, sarcasm, (14 more...)

arXiv.org Artificial Intelligence

2312.04642

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre:

Research Report > Experimental Study (0.95)
Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.42)

Add feedback

InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts

Sathvik, MSVPJ, Sarkar, Surjodeep, Saxena, Chandni, Sohn, Sunghwan, Garg, Muskan

arXiv.org Artificial IntelligenceNov-21-2023

Mental health professionals and clinicians have observed the upsurge of mental disorders due to Interpersonal Risk Factors (IRFs). To simulate the human-in-the-loop triaging scenario for early detection of mental health disorders, we recognized textual indications to ascertain these IRFs : Thwarted Belongingness (TBe) and Perceived Burdensomeness (PBu) within personal narratives. In light of this, we use N-shot learning with GPT-3 model on the IRF dataset, and underscored the importance of fine-tuning GPT-3 model to incorporate the context-specific sensitivity and the interconnectedness of textual cues that represent both IRFs. In this paper, we introduce an Interpretable Prompting (InterPrompt)} method to boost the attention mechanism by fine-tuning the GPT-3 model. This allows a more sophisticated level of language modification by adjusting the pre-trained weights. Our model learns to detect usual patterns and underlying connections across both the IRFs, which leads to better system-level explainability and trustworthiness. The results of our research demonstrate that all four variants of GPT-3 model, when fine-tuned with InterPrompt, perform considerably better as compared to the baseline methods, both in terms of classification and explanation generation.

gpt-3 model, interprompt, textual cue, (13 more...)

arXiv.org Artificial Intelligence

2311.12404

Country:

North America > United States > Minnesota > Olmsted County > Rochester (0.04)
Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Cabbage Sweeter than Cake? Analysing the Potential of Large Language Models for Learning Conceptual Spaces

Chatterjee, Usashi, Gajbhiye, Amit, Schockaert, Steven

arXiv.org Artificial IntelligenceOct-9-2023

The theory of Conceptual Spaces is an influential cognitive-linguistic framework for representing the meaning of concepts. Conceptual spaces are constructed from a set of quality dimensions, which essentially correspond to primitive perceptual features (e.g. hue or size). These quality dimensions are usually learned from human judgements, which means that applications of conceptual spaces tend to be limited to narrow domains (e.g. modelling colour or taste). Encouraged by recent findings about the ability of Large Language Models (LLMs) to learn perceptually grounded representations, we explore the potential of such models for learning conceptual spaces. Our experiments show that LLMs can indeed be used for learning meaningful representations to some extent. However, we also find that fine-tuned models of the BERT family are able to match or even outperform the largest GPT-3 model, despite being 2 to 3 orders of magnitude smaller.

dataset, probability, representation, (14 more...)

arXiv.org Artificial Intelligence

2310.05481

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Dominican Republic (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Add feedback

Exploring the Effectiveness of GPT Models in Test-Taking: A Case Study of the Driver's License Knowledge Test

Rahimi, Saba, Balch, Tucker, Veloso, Manuela

arXiv.org Artificial IntelligenceAug-22-2023

Large language models such as Open AI's Generative Pre-trained Transformer (GPT) models are proficient at answering questions, but their knowledge is confined to the information present in their training data. This limitation renders them ineffective when confronted with questions about recent developments or non-public documents. Our research proposes a method that enables GPT models to answer questions by employing context from an information source not previously included in their training data. The methodology includes preprocessing of contextual information, the embedding of contexts and queries, constructing prompt through the integration of context embeddings, and generating answers using GPT models. We applied this method in a controlled test scenario using the California Driver's Handbook as the information source. The GPT-3 model achieved a 96% passing score on a set of 50 sample driving knowledge test questions. In contrast, without context, the model's passing score fell to 82%. However, the model still fails to answer some questions correctly even with providing library of context, highlighting room for improvement. The research also examined the impact of prompt length and context format, on the model's performance. Overall, the study provides insights into the limitations and potential improvements for GPT models in question-answering tasks.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2308.11827

Country:

North America > United States > California (0.27)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Experimental Study (0.54)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DWReCO at CheckThat! 2023: Enhancing Subjectivity Detection through Style-based Data Sampling

Schlicht, Ipek Baris, Khellaf, Lynn, Altiok, Defne

arXiv.org Artificial IntelligenceJul-7-2023

This paper describes our submission for the subjectivity detection task at the CheckThat! Lab. To tackle class imbalances in the task, we have generated additional training materials with GPT-3 models using prompts of different styles from a subjectivity checklist based on journalistic perspective. We used the extended training set to fine-tune language-specific transformer models. Our experiments in English, German and Turkish demonstrate that different subjective styles are effective across all languages. In addition, we observe that the style-based oversampling is better than paraphrasing in Turkish and English. Lastly, the GPT-3 models sometimes produce lacklustre results when generating style-based texts in non-English languages.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2307.0355

Country:

Europe > Switzerland (0.04)
Europe > Greece > Central Macedonia > Thessaloniki (0.04)
Europe > Germany > Berlin (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media > News (0.70)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback